Empirical Estimates of Adaptation: The Chance of Two Noriegas Is Closer to p/2 than p^2
Author
Abstract
Repetition is very common. Adaptive language models, which allow probabilities to change or adapt after seeing just a few words of a text, were introduced in speech recognition to account for text cohesion. Suppose a document mentions Noriega once. What is the chance that he will be mentioned again? If the first instance has probability p, then under standard (bag-of-words) independence assumptions, two instances ought to have probability p^2, but we find the probability is actually closer to p/2. The first mention of a word obviously depends on frequency, but surprisingly, the second does not. Adaptation depends more on lexical content than on frequency; there is more adaptation for content words (proper nouns, technical terminology and good keywords for information retrieval), and less adaptation for function words, cliches and ordinary first names.
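The comparison in the abstract can be made concrete with a small sketch. This is not the paper's estimation procedure: it assumes p is estimated as the fraction of documents that mention a word at least once, and that the chance of two instances is estimated as the fraction that mention it at least twice. The function name `adaptation_estimates` and the toy corpus are hypothetical, and the corpus is contrived purely to show the arithmetic, so it is not evidence for the claim.

```python
from collections import Counter


def adaptation_estimates(docs, word):
    """Return (p, p_two, p_squared, p_half) for `word`.

    Assumed estimators (an illustration, not the paper's method):
      p      = fraction of documents with at least one mention
      p_two  = fraction of documents with at least two mentions
    p_squared is the bag-of-words independence prediction for two mentions;
    p_half is the p/2 value the abstract reports as closer to what is observed.
    """
    n_docs = len(docs)
    counts = [Counter(doc)[word] for doc in docs]
    df1 = sum(1 for c in counts if c >= 1)  # documents with >= 1 mention
    df2 = sum(1 for c in counts if c >= 2)  # documents with >= 2 mentions
    p = df1 / n_docs
    p_two = df2 / n_docs
    return p, p_two, p * p, p / 2


# Hypothetical toy corpus: 10 tokenized documents, 2 mention "noriega",
# and 1 of those mentions him twice.
docs = [
    "noriega fled and later noriega surrendered".split(),
    "the general noriega was mentioned once".split(),
] + [["unrelated", "news", "text"] for _ in range(8)]

p, p_two, p_squared, p_half = adaptation_estimates(docs, "noriega")
print(f"p={p:.2f}  observed two mentions={p_two:.2f}  "
      f"p^2={p_squared:.2f}  p/2={p_half:.2f}")
```

On a real collection the same counts would be taken over many words, which is where the contrast between content words and function words described in the abstract would show up.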
Similar resources
Empirical Estimates of Adaptation: The Chance of Two Noriegas Is Closer to p/2 than p^2 Kenneth
Repetition is very common. Adaptive language models, which allow probabilities to change or adapt after seeing just a few words of a text, were introduced in speech recognition to account for text cohesion. Suppose a document mentions Noriega once. What is the chance that he will be mentioned again? If the first instance has probability p, then under standard (bag-of-words) independence assumpt...
Evaluation of two empirical methods and artificial neural network models for estimating solar radiation reaching the earth: a case study in southeast Tehran
Daily solar radiation intercepted at the earth’s surface is an input required for water resources, environmental and agricultural studies. However, the measurement of this parameter can only be done in a few places. This has led researchers to develop a number of methods for estimating solar radiation based on frequently available meteorological records such as hours of sunshine or air temperat...
Simultaneous estimation of markup and returns to scale in Iranian manufacturing industries
The current study attempts to estimate simultaneously the markup and returns to scale of 19 two-digit ISIC manufacturing industries of Iran, using the Solow residual and structural approaches, during the period 1995-2007. Based on the Solow residual approach, the neoclassical assumption of constant returns to scale is supported in 95% of manufacturing industries; however in 84% of industrie...
Comparison of different empirical methods for estimating daily reference evapotranspiration in the humid cold climate (case study: Borujen, Shahrekord, Koohrang and Lordegan)
The method proposed for calculating potential evapotranspiration is the FAO Penman-Monteith method, but other methods require less meteorological data and still give estimates close to those of the FAO Penman-Monteith method under different climatic conditions. Evaluating the performance of these methods on the same basis is a prerequisite for selecting an alternative approach in accordance with available da...
Pillar Design in the Hard Rock Mines of South Africa
This paper gives an overview of the difficulties associated with the design of hard rock pillars in South African mines. Recent examples of large-scale pillar collapses in South Africa suggest that these were caused by weak partings which traversed the pillars. Currently two different methods are used to determine the strength of pillars, namely, empirical equations derived from back analyses o...
Publication date: 2000